A Case Study in Tagging Case in German: An Assessment of Statistical Approaches
Author: Simon Clematide
Abstract
In this study, we assess the performance of purely statistical approaches using supervised machine learning for predicting case in German (nominative, accusative, dative, genitive, n/a). We experiment with two different treebanks containing morphological annotations: TIGER and TUEBA. An evaluation with 10-fold cross-validation serves as the basis for systematic comparisons of the optimal parametrizations of different approaches. We test taggers based on Hidden Markov Models (HMM), Decision Trees, and Conditional Random Fields (CRF). The CRF approach based on our hand-crafted feature model achieves an accuracy of about 94%. This outperforms all other approaches and results in an improvement of 11% compared to a baseline HMM trigram tagger and an improvement of 2% compared to a state-of-the-art tagger for rich morphological tagsets. Moreover, we investigate the effect of additional (morphological) categories (gender, number, person, part of speech) in the internal tagset used for the training. Rich internal tagsets improve results for all tested approaches.

Originally published at: Clematide, Simon (2013). A Case Study in Tagging Case in German: An Assessment of Statistical Approaches. In: Mahlow, Cerstin; Piotrowski, Michael. Systems and Frameworks for Computational Morphology. Heidelberg New York Dordrecht London: Springer, 22-34. DOI: https://doi.org/10.1007/978-3-642-40486-3_2

Submitted version posted at the Zurich Open Repository and Archive (ZORA), University of Zurich. ZORA URL: https://doi.org/10.5167/uzh-85713
Similar resources
The Quantification of Uncertainties in Production Prediction Using Integrated Statistical and Neural Network Approaches: An Iranian Gas Field Case Study
Uncertainty in production prediction has been the subject of numerous investigations. Geological and reservoir engineering data comprise a huge number of data entries to the simulation models. Thus, uncertainty in these data can largely affect the reliability of the simulation model. For these reasons, it is worthwhile to present the desired quantity with a probability distribution instead of a sing...
An unusual case of nasal mucormycosis caused by Rhizopus oryzae in a German shepherd dog
This study presents an unusual case of mucormycosis localized in the nasal cavity of a German shepherd dog. The patient was a 1-year-old male guard dog with unilateral nasal epistaxis, mucopurulent nasal discharge, sneezing and nose pawing. The dog had a history of head trauma about 2 months before admission, which was associated with mild self-limited epistaxis. Initial nasal rhinoscopy showed s...
Assessment of surgical approaches of supracondylar fracture of elbow in children
The aim of this study was to assess different surgical approaches to supracondylar fracture in 31 children. The surgical approaches consisted of anteromedial (16 cases), posterior (13 cases) and anterolateral (2 cases). The rate of persistent complications was higher in the posterior approach. The most common complication was soft tissue atrophy in the posterior of the elbow (69%) followed by limitation of...
Assessment of the completeness of Volunteered Geographic Information focusing on building blocks data (Case Study: Tehran metropolis)
OpenStreetMap (OSM) is currently the largest collection of volunteered geographic data and is widely used in many projects as an alternative to, or integrated with, authoritative data. However, the quality of these data has been one of the obstacles to their wider use. In this article, from among the elements related to the quality of volunteered geographic data, we have tried to examine the com...
Asymmetric Lumbosacral Transitional Vertebra (LTV) Type-3 in a German Shepherd Dog: A Case Report
Case Description- In the present study, a seven-year-old female German shepherd dog was referred to the Veterinary Hospital of Shahid Chamran University of Ahvaz, with a two-week history of intermittent lameness and lumbosacral pain. Clinical Findings- On general examination, the vital parameters were within normal limits. A ventrodorsal (VD) radiograph of the pelvis and lumbosacral spine was take...
Publication date: 2013